Skip to content

Py vernier aggregations#223

Merged
EdHone merged 12 commits intoMetOffice:mainfrom
mo-marqh:py-vernier-aggregations
Mar 13, 2026
Merged

Py vernier aggregations#223
EdHone merged 12 commits intoMetOffice:mainfrom
mo-marqh:py-vernier-aggregations

Conversation

@mo-marqh
Copy link
Member

@mo-marqh mo-marqh commented Mar 4, 2026

Description

a new class is added to the post-processing vernier python library, to enable aggregation across multiple runs, that are asserted to be the same, but where the members can be also accessed, removed, etc.

Linked issues

Closes # (issue)

related to #221

related to #229

Type of change

  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • This change requires a documentation update

How has this been tested?

Please describe the tests that you ran to verify your changes. Provide instructions so we can reproduce. Please also list any relevant details for your test configuration

  • New tests have been added
  • Tests have been modified to accommodate this change

Checklist:

  • My code follows the style guidelines of this project
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes, for both debug and optimised builds

@andrewcoughtrie
Copy link
Collaborator

Something @oakleybrunt mentioned on the previous PR (and @mo-mglover has just mentioned as well) the spelling of calliper should have 2 ls in this repo, we have tried to be consistent with this and so should continue to do so. So if this can be corrected in this PR that would be great.

@github-actions github-actions bot added the cla-required The CLA has not yet been signed by the author of this PR - added by GA label Mar 5, 2026
@mo-marqh
Copy link
Member Author

mo-marqh commented Mar 5, 2026

Something @oakleybrunt mentioned on the previous PR (and @mo-mglover has just mentioned as well) the spelling of calliper should have 2 ls in this repo, we have tried to be consistent with this and so should continue to do so. So if this can be corrected in this PR that would be great.

agreed, see
f9ed767
@andrewcoughtrie

@mo-marqh mo-marqh force-pushed the py-vernier-aggregations branch from 3c9bb2a to f9ed767 Compare March 5, 2026 14:14
@github-actions github-actions bot added cla-signed The CLA has been signed as part of this PR - added by GA and removed cla-required The CLA has not yet been signed by the author of this PR - added by GA labels Mar 5, 2026
@mo-marqh
Copy link
Member Author

mo-marqh commented Mar 9, 2026

i have added tests for the new aggregation class

this is likely suitable for review now @EdHone

@EdHone
Copy link
Collaborator

EdHone commented Mar 9, 2026

The technical implementation seems sound to me all round.
In terms of the design, I'm not entirely sure what the new class is offering in terms of additional user functionality that doesn't already exist with the aggregate function (that admittedly isn't a part of the API but could be).

For the case where a user wants to group a set of runs that have been loaded in order to analyse them as one, it's simple to do with a script like this:

from vernier import VernierReader, aggregate

# Load data
run1_timers = VernierReader(Path(run1_file)).load()
run2_timers = VernierReader(Path(run2_file)).load()
run3_timers = VernierReader(Path(run3_file)).load()
run4_timers = VernierReader(Path(run4_file)).load()

# Analyse individual data
analyse_data(run1_timers)
analyse_data(run2_timers)
analyse_data(run3_timers)
analyse_data(run4_timers)

# Analyse aggregated data
aggregated_timers = aggregate[run1_timers, run2_timers, run3_timers, run4_timers]
analyse_data(aggregated_timers)

with the benefit that because the product of the aggregation is just a VernierData object, the same analysis functions can be used.

If i'm misunderstanding the scope of your change and the example I've given above isn't the same as what you're trying to achieve please let me know

Copy link
Collaborator

@EdHone EdHone left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comment above ^

@mo-marqh
Copy link
Member Author

In terms of the design, I'm not entirely sure what the new class is offering in terms of additional user functionality that doesn't already exist with the aggregate function (that admittedly isn't a part of the API but could be).

I think that the design issue here is important.

I think that having an object which aggregates whilst keeping the identities of the members separate is a valuable and different entity from an aggregation that loses the distinction between input units.

On reflection, I think that the aggregate function as defined would be better to be moved to a method on the VernierData class, to continue to enable collecting data from many single files (& perhaps renamed)

I think that the VernierDataAggregation class provides a richer interface, better supporting an aggregation by assertion, that is that two or more independent runs can be considered together, whilst a VernierData method that support managing data in different files from the same run can be adapted from the current function.

I'll have a look at putting such a change set into a further commit to this PR, for design and implementation consideration.

@mo-marqh
Copy link
Member Author

mo-marqh commented Mar 11, 2026

8408b39
implements the suggestion I made in #223 (comment)

this provides a consistent design, with aggregate now being a method on the VernierData class, and the different behaviour of an asserted aggregation being a new VernierDataAggregation class

@mo-marqh
Copy link
Member Author

mo-marqh commented Mar 11, 2026

this does leave a further potential confusion, with overuse of the term aggregate, so I have changed the name of the new class to VernierDataCollation
8b80cbd
This serves to disambiguate, and enhances the differential that a collation is used to bring together different pieces of written information so that the similarities and differences can be seen.

An aggregation, in constrast, is something formed by adding together several amounts or things

@mo-marqh
Copy link
Member Author

mo-marqh commented Mar 11, 2026

i think this is ready for re-review now, it would be helpful if
#231 could be reviewed and merged first, then this rebased to that, as #231 is a spelling consistency change which confuses the diff within the PR
@EdHone

update: #231 closed, rebased to main

@mo-marqh mo-marqh force-pushed the py-vernier-aggregations branch from 280d574 to 02acc93 Compare March 11, 2026 14:13
@github-actions github-actions bot added cla-modified The CLA has been modified as part of this PR - added by GA and removed cla-signed The CLA has been signed as part of this PR - added by GA labels Mar 11, 2026
@mo-marqh mo-marqh force-pushed the py-vernier-aggregations branch from 02acc93 to cad1247 Compare March 11, 2026 14:26
@github-actions github-actions bot removed the cla-modified The CLA has been modified as part of this PR - added by GA label Mar 11, 2026
@EdHone
Copy link
Collaborator

EdHone commented Mar 13, 2026

Really happy with these changes, I'll get into a full review soon. I think the potential to enforce consistency of interface between the VernierData and VernierDataCollation is an opportunity we shouldn't pass on, as this would make all user analysis scripts consistent

Copy link
Collaborator

@EdHone EdHone left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few spelling issues to fix

Copy link
Collaborator

@EdHone EdHone left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I managed to use GitHub's magic to fix the spelling errors - now approved

@EdHone EdHone merged commit 85aed81 into MetOffice:main Mar 13, 2026
12 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants